Minimizing the Communication Time for Matrix Multiplication on Multiprocessors
نویسنده
چکیده
We present one matrix multiplication algorithm for two{dimensional arrays of processing nodes, and one algorithm for three{dimensional nodal arrays. One{dimensional nodal arrays are treated as a degenerate case. The algorithms are designed to utilize fully the communications bandwidth in high degree networks in which the one{, two{, or three{dimensional arrays may be embedded. For binary n-cubes, our algorithms ooer a speedup of the communication over previous algorithms for square matrices and square two{dimensional arrays by a factor of n 2. Connguring the N = 2 n processing nodes as a three-dimensional array may reduce the communication complexity by a factor of N 1 6 compared to a two{dimensional nodal array. The three{dimensional algorithm requires temporary storage proportional to the length of the nodal array axis aligned with the axis shared between the multiplier and the multiplicand. The optimal two{dimensional nodal array shape with respect to communication has a ratio between the numbers of node rows and columns equal to the ratio between the numbers of matrix rows and columns of the product matrix, with the product matrix accumulated in{place. The optimal three{dimensional nodal array shape has a ratio between the lengths of the machine axes equal approximately to the ratio between the lengths of the three axes in matrix multiplication. For product matrices of extreme shape, one{dimensional nodal array shapes are optimal when N=n < columns of the product matrix. All our algorithms use standard communication functions.
منابع مشابه
Parallel Matrix Multiplication Algorithms on Hypercube Multiprocessors 1
In this paper, we present three parallel algorithms for matrix multiplication. The rst one, which employs pipelining techniques on a mesh grid, uses only one copy of data matrices. The second one uses multiple copies of data matrices also on a mesh grid. Although data communication operations of the second algorithm are reduced, the requirement of local data memory for each processing element i...
متن کاملModified 32-Bit Shift-Add Multiplier Design for Low Power Application
Multiplication is a basic operation in any signal processing application. Multiplication is the most important one among the four arithmetic operations like addition, subtraction, and division. Multipliers are usually hardware intensive, and the main parameters of concern are high speed, low cost, and less VLSI area. The propagation time and power consumption in the multiplier are always high. ...
متن کاملA New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication which could run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication can not run on Fibonacci Hypercube structure, therefore giving a method that can be run on all structures especially Fibonacci Hypercube structure is necessary for parallel matr...
متن کاملMatrix-Matrix Multiplications and Fault Tolerance on Hypercube Multiprocessors
Several new algorithms for matrix-matrix multiplications on hypercube multiprocessors are presented and evaluated based on the number of multiplications, additions, and transfers. The matrices ~I be multiplied are uniformly distributed to all processors of a hypercube system. Each processor owns some submatrices which are derived by dividing the source matrices. Each submatrix multiplication ca...
متن کاملSegmented Operations for Sparse Matrix Computation on Vector Multiprocessors
In this paper we present a new technique for sparse matrix multiplication on vector multiprocessors based on the efficient implementation of a segmented sum operation. We describe how the segmented sum can be implemented on vector multiprocessors such that it both fully vectorizes within each processor and parallelizes across processors. Because of our method’s insensitivity to relative row siz...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Parallel Computing
دوره 19 شماره
صفحات -
تاریخ انتشار 1993